Shuyue Stella Li

Privasis: Synthesizing the Largest "Public" Private Dataset from Scratch

Feb 03, 2026
Via arXiv

Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

Jan 15, 2026
Via arXiv

Olmo 3

Dec 15, 2025
Via arXiv

RLVE: Scaling Up Reinforcement Learning for Language Models with Adaptive Verifiable Environments

Nov 10, 2025
Via arXiv

PrefPalette: Personalized Preference Modeling with Latent Attributes

Jul 17, 2025
Via arXiv

Spurious Rewards: Rethinking Training Signals in RLVR

Jun 12, 2025
Via arXiv

Precise Information Control in Long-Form Text Generation

Jun 06, 2025
Via arXiv

BehaviorSFT: Behavioral Token Conditioning for Clinical Agents Across the Proactivity Spectrum

May 27, 2025
Via arXiv

BLAB: Brutally Long Audio Bench

May 05, 2025
Via arXiv

A False Sense of Privacy: Evaluating Textual Data Sanitization Beyond Surface-level Privacy Leakage

Apr 28, 2025
Via arXiv